Implementing Prizm Services Caching Strategies
The power behind Prizm Services’ ability to deliver viewable web content quickly and efficiently lies with its cache management. Viewing a multipage document requires that each document page be converted into a web-compatible format such as JPEG, PNG, or ideally SVG (which gives the highest fidelity upon scaling). The conversion process is not instantaneous, which means there is some delay before a page can be made viewable. Because Prizm Services assumes a document will be viewed by more than one person over multiple sessions, it converts all the pages into web-viewable intermediate objects that are stored in its cache folders.
The conversion process begins when the viewing session is started or with the first request to view a document page by a given viewing session. Typically, the viewable page data that is generated will then be made available to any subsequent request for the same pages, reducing the time to view to only the time it takes to download the page data to the browser. To summarize, the cached files help deliver viewing performance because the viewing objects are pre-generated and stored in the cache folders.
Cached files must be kept on a storage device for some period of time. Files cached for viewing can take up a considerable amount of space, so the growth of the cache needs to be controlled. Prizm Services provides options for controlling both where the files are stored and how long they are stored there. The cache is made up of folders with different purposes, and each folder can be relocated to a different device to spread the cache burden if necessary.
The majority of the Prizm Services cache is made up of pre-generated document pages that are readily available on demand. Caching these files already helps performance when the same document is viewed repeatedly. There are three configurable cache folder locations, and placing certain ones on more responsive media can result in a better viewing experience with less burden on the server hosting Prizm Services. Solid state drives (SSD) or Shared Memory (Linux only) minimize input/output (I/O) latency and access times for cached files, but these storage devices typically offer much less storage capacity.
Several scenarios are described below with proposed cache configuration solutions. The user should be familiar with the Prizm Services pcc.config file settings as outlined in Prizm Services Configuration Options. Along with the pcc.config file, there is a property in the JSON object which the application posts when requesting a new viewing session from Prizm Services (refer to the Prizm Services sample and the How To Adjust Caching Parameters for Prizm Services topic).
The default settings in the pcc.config file cause viewing sessions to time out after 20 minutes and cached files to expire after one day. Also by default, the Prizm Services cache folders are all created within the same parent directory on the root drive. These default settings give a reader 20 minutes to read a document once the viewing session is started. After that time period, a new viewing session must be created for them to continue reading the document, either by refreshing their browser or through another mechanism you implement in your application.
The next time the same document is viewed, Prizm Services will simply deliver the viewing objects that were created in the first viewing session to the same reader, or to any other reader viewing the same document, for about 24 hours after the first viewing session was created. When a reader (same or new) requests to read the document a day later, the cache process starts over because Prizm Services will have already deleted the cached pages and will have to regenerate all the viewable content of the document.
Viewing response appears slow even with caching enabled because many readers are viewing the document.
Set the GroupStateFolder setting in the pcc.config file to a folder on a faster SSD device or, in Linux environments, to a folder on the Shared Memory device (for example, /dev/shm). The other cache folders noted in pcc.config, DocumentPath and TempcachePath, could also benefit from being placed on faster storage devices.
Example for Shared Memory Device
<GroupStateFolder>/dev/shm/Accusoft/Prizm/GroupState</GroupStateFolder>
<DocumentPath>/dev/shm/Accusoft/Prizm/DocumentCache</DocumentPath>
<TempcachePath>/dev/shm/Accusoft/Prizm/Cache</TempcachePath>
The above settings in pcc.config set the cache directories to folders in Shared Memory in a Linux environment. Because Shared Memory is faster than standard disk drives, Prizm Services will typically respond more quickly, with less overall stress on the server to deliver viewing content.
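Because fast devices such as Shared Memory tend to be small, it can help to check how much space a candidate location has before pointing the cache folders at it. The following is a minimal sketch that uses only the Python standard library. The helper name `free_space_report` is hypothetical and not part of Prizm Services; the folder paths are the ones from the example above.

```python
import shutil
from pathlib import Path


def free_space_report(folders):
    """Return {folder: (free_bytes, total_bytes)} for each candidate folder."""
    report = {}
    for folder in folders:
        path = Path(folder)
        # Walk up to the nearest existing ancestor so that folders that
        # have not been created yet can still be checked.
        while not path.exists() and path != path.parent:
            path = path.parent
        usage = shutil.disk_usage(path)
        report[str(folder)] = (usage.free, usage.total)
    return report


if __name__ == "__main__":
    # Paths taken from the Shared Memory example; adjust to your own layout.
    candidates = [
        "/dev/shm/Accusoft/Prizm/GroupState",
        "/dev/shm/Accusoft/Prizm/DocumentCache",
        "/dev/shm/Accusoft/Prizm/Cache",
    ]
    for folder, (free, total) in free_space_report(candidates).items():
        print(f"{folder}: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
```

If a folder does not exist yet, the figures reported are those of the device holding its nearest existing parent directory, so on a machine without /dev/shm the numbers describe the root device instead.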
Viewers are getting errors and the storage device used for the Prizm Services cache is showing errors because the device is full.
Depending on available storage capacity of the selected device, the cache expiration period specified by CacheExpirationPeriod in pcc.config may need to be shortened to accommodate cache load. Please note that the time period for CacheExpirationPeriod should not be any shorter than the ViewingSessionTimeout time period. Otherwise, the ViewingSessionTimeout will take precedence and the cache expiration period will be forced to the same value. The ViewingSessionTimeout time period can be shortened but at the penalty of reducing the amount of time a user has to read a document in a single viewing session.
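The precedence rule described above can be sketched as a small check. This is an illustration only and not Prizm Services code: the function names are hypothetical, and the parser assumes only the `m`, `h`, and `d` suffixes that appear in this topic's examples. Other duration formats that pcc.config may accept are not handled.

```python
import re
from datetime import timedelta

# Assumption: only the suffixes seen in this topic's examples are supported.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse values such as '20m', '1h' or '1d' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([mhd])\s*", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def effective_cache_expiration(cache_expiration, viewing_session_timeout):
    """Return the expiration period after the session timeout takes precedence.

    If CacheExpirationPeriod is shorter than ViewingSessionTimeout, the
    expiration period is forced up to the timeout value.
    """
    expiration = parse_duration(cache_expiration)
    timeout = parse_duration(viewing_session_timeout)
    return max(expiration, timeout)
```

For example, `effective_cache_expiration("10m", "20m")` yields a 20 minute period, because the shorter expiration value is overridden by the viewing session timeout.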
Rather than changing the viewing session timeout period, try increasing the size of the (fast) storage device. If it is not practical to change the size of the storage device, try moving the directory specified by TempcachePath to a different storage device, and if that isn’t enough, do the same for DocumentPath. Splitting cache folders across different dedicated storage devices can benefit performance by reducing disk latency for Hard Disk Drives (HDD) compared to having one HDD serving all the viewing sessions.
Example for Quicker Cache Cleanup
<CacheExpirationPeriod>20m</CacheExpirationPeriod>
<ViewingSessionTimeout>15m</ViewingSessionTimeout>
The above settings set the viewing session timeout to 15 minutes and the life expectancy of any cached file to 20 minutes. After approximately 35 to 45 minutes, the cached files for a given document will be deleted. The exact time of cleanup can vary based on the scheduled nature of the cleanup processes and current load on the server.
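The 35 minute lower bound above matches the viewing session timeout plus the cache expiration period. Reading it that way is an inference from the figures in this topic, not a documented formula, and the function below is a hypothetical helper rather than part of Prizm Services. Under that assumption, the earliest removal time can be estimated as follows:

```python
from datetime import datetime, timedelta


def earliest_cache_removal(session_start, viewing_session_timeout, cache_expiration):
    """Estimate the earliest moment cached files could be removed.

    Assumes the expiration period starts counting once the viewing session
    has timed out, and ignores any delay added by the cleanup schedule.
    """
    return session_start + viewing_session_timeout + cache_expiration


# Settings from the example above: 15 minute timeout, 20 minute expiration.
start = datetime(2024, 1, 1, 9, 0)
earliest = earliest_cache_removal(start, timedelta(minutes=15), timedelta(minutes=20))
print(earliest.time())  # 09:35:00
```

Actual removal happens some time after this estimate, since cleanup runs on a schedule and depends on server load.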
Your application views a lot of large documents and users are not able to read them in time before they get a viewing session timeout error.
The default setting in the pcc.config file for ViewingSessionTimeout is 20 minutes. It can be increased to a larger value but that means Prizm Services will have more resources to track at any given moment which could affect performance and host server capacity.
Example of Longer Viewing Session Duration
<ViewingSessionTimeout>1h</ViewingSessionTimeout>
<CacheExpirationPeriod>1d</CacheExpirationPeriod>
The above settings give users an hour to read a given document in a single viewing session. Cache resources for the document will be removed 25 or more hours later. As above, there is variability for cache cleanup based on the scheduled nature of the cleanup processes and current load on the server.
The documents served are fairly random and not typically shared with others.
- Or -
The image is watermarked uniquely for each viewer and should not be shared.
In this scenario, the cache resources are not likely to be needed by anyone except the initial user. There is a property in the JSON object which the application posts when requesting a new viewing session from Prizm Services that can be used to disable caching on a per-viewing-session basis. The property, serverCaching, should be set explicitly to the string value none when the application sends the POST request for a new viewing session ID. Each document uploaded to Prizm Services will then be converted without Prizm Services looking for an existing copy of the document. After the viewing session times out, the cached items for the document will be removed on a predetermined schedule, which should be fairly quick because no other viewing sessions are using the data. For example:
Example
POST /ViewingSession
{
  ...
  "serverCaching": "none",
  ...
}
After the viewing session timeout, the cache items should be removed fairly soon.
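As an illustration of how an application might prepare such a request, the sketch below builds the POST with Python's standard library. The host name is a placeholder, the helper name `build_viewing_session_request` is hypothetical, and the other properties elided with `...` above are not filled in; the caller passes them through unchanged. Only the `/ViewingSession` path and the `serverCaching` property come from this topic.

```python
import json
import urllib.request


def build_viewing_session_request(base_url, extra_properties=None):
    """Build (but do not send) a POST /ViewingSession request with caching disabled."""
    # Any other properties your application normally posts go in extra_properties.
    body = dict(extra_properties or {})
    body["serverCaching"] = "none"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ViewingSession",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host; replace with the address of your Prizm Services instance.
request = build_viewing_session_request("http://prizm.example.invalid")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) is left out so the sketch can be run without a server.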
The Prizm Services cache provides a mechanism to deliver document content in a timely manner. However, each application is different and may tax server resources differently or have more demanding requirements. Balancing resource constraints against user experience can be a difficult task that may require compromises. Faster hardware, more specifically high-speed storage devices, coupled with an understanding of the options for adjusting how the Prizm Services cache behaves should allow you to reach a desired level of performance while maintaining a good user